NP Bracketing by Maximum Entropy Tagging and SVM Reranking
Abstract
We perform noun phrase (NP) bracketing with a local, maximum-entropy-based tagging model that produces bracketing hypotheses. These hypotheses are then fed into a reranking framework based on support vector machines. We handle hierarchical structure in the tagging model by using underspecified tags, which are fully determined only at decoding time. The tagging model performs comparably to competing approaches, and the subsequent reranking raises our system's F-score from 81.7 to 86.1, surpassing the best previously reported result of 83.8.
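The two-stage idea in the abstract (generate bracketing hypotheses, then rerank them) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the span representation, the feature vectors, and the weights are hypothetical, and a plain linear scorer stands in for the SVM-based reranker, whose features and training are not described in the abstract. The F-score shown is the usual harmonic mean of precision and recall over exactly matching brackets, which may differ in detail from the paper's evaluation.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices of one NP bracket (assumed representation)
Hypothesis = Tuple[Set[Span], List[float]]  # (brackets, hypothetical feature vector)


def bracket_f_score(predicted: Set[Span], gold: Set[Span]) -> float:
    """Harmonic mean of precision and recall over exactly matching brackets."""
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def rerank(hypotheses: List[Hypothesis], weights: List[float]) -> Set[Span]:
    """Return the brackets of the hypothesis with the highest linear score.

    A linear scorer is used here only as a stand-in for a trained reranker.
    """
    def score(features: List[float]) -> float:
        return sum(w * f for w, f in zip(weights, features))

    best_brackets, _ = max(hypotheses, key=lambda h: score(h[1]))
    return best_brackets


# Hypothetical example: two candidate bracketings of one sentence.
gold = {(0, 2), (1, 2), (4, 6)}
flat = ({(0, 2), (4, 6)}, [-1.0, 0.0])            # tagger's first choice, misses the nested NP
nested = ({(0, 2), (1, 2), (4, 6)}, [-1.5, 1.0])  # lower tagger score, extra reranking feature fires
weights = [1.0, 1.0]                               # made-up weights

chosen = rerank([flat, nested], weights)
print(bracket_f_score(flat[0], gold), bracket_f_score(chosen, gold))
```

In this toy case the reranker prefers the nested hypothesis even though the tagger scored it lower, which is the kind of correction a reranking stage is meant to make.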
Similar resources
Forest Reranking through Subtree Ranking
We propose the subtree ranking approach to parse forest reranking which is a generalization of current perceptron-based reranking methods. For the training of the reranker, we extract competing local subtrees, hence the training instances (candidate subtree sets) are very similar to those used during beamsearch parsing. This leads to better parameter optimization. Another chief advantage of the...
Greek Named Entity Recognition using Support Vector Machines, Maximum Entropy and Onetime
We describe our work on Greek Named Entity Recognition using comparatively three different machine learning techniques: (i) Support Vector Machines (SVM), (ii) Maximum Entropy and (iii) Onetime, a shortcut method based on previous work of one of the authors. The majority of our system’s features use linguistic knowledge provided by: morphology, punctuation, position of the lexical units within ...
Dependency-Based Bracketing Transduction Grammar for Statistical Machine Translation
In this paper, we propose a novel dependency-based bracketing transduction grammar for statistical machine translation, which converts a source sentence into a target dependency tree. Different from conventional bracketing transduction grammar models, we encode target dependency information into our lexical rules directly, and then we employ two different maximum entropy models to determine the...
Word Lattice Reranking for Chinese Word Segmentation and Part-of-Speech Tagging
In this paper, we describe a new reranking strategy named word lattice reranking, for the task of joint Chinese word segmentation and part-of-speech (POS) tagging. As a derivation of the forest reranking for parsing (Huang, 2008), this strategy reranks on the pruned word lattice, which potentially contains much more candidates while using less storage, compared with the traditional n-best list ...
Voted Approach for Part of Speech Tagging in Bengali
Part of Speech (POS) tagging is the task of labeling each word in a sentence with its appropriate syntactic category called part of speech. POS tagging is a very important preprocessing task for language processing activities. In this paper, we report about our work on POS tagging for Bengali by combining different POS tagging systems using three weighted voting techniques. The individual POS t...
Publication date: 2004